Adding Stochastic Negative Examples into Machine Learning Improves Molecular Bioactivity Prediction
Similar resources
Teaching machine learning from examples
We used the datasets of the NIPS 2003 challenge on feature selection as part of the practical work of an undergraduate course on feature extraction. The students were provided with a toolkit implemented in Matlab. Part of the course requirements was that they should outperform given baseline methods. The results were beyond expectations: the students matched or exceeded the performance of the be...
Learning When Negative Examples Abound
Existing concept learning systems can fail when the negative examples heavily outnumber the positive examples. The paper discusses one essential trouble brought about by imbalanced training sets and presents a learning algorithm addressing this issue. The experiments (with synthetic and real-world data) focus on 2-class problems with examples described with binary and continuous attributes.
A Machine Learning Researcher's Foray into Recidivism Prediction
We discuss an application of machine learning to recidivism prediction. Our initial results motivate the need for a methodology for technique selection for applications that involve unequal but unknown error costs, a skewed data set, or both. Evaluation methodologies traditionally used in machine learning are inadequate for analyzing performance in these situations, although they arise frequent...
Target prediction utilising negative bioactivity data covering large chemical space
BACKGROUND: In silico analyses are increasingly being used to support mode-of-action investigations; however many such approaches do not utilise the large amounts of inactive data held in chemogenomic repositories. The objective of this work is concerned with the integration of such bioactivity data in the target prediction of orphan compounds to produce the probability of activity and inactivit...
Metabolite identification and molecular fingerprint prediction through machine learning
MOTIVATION: Metabolite identification from tandem mass spectra is an important problem in metabolomics, underpinning subsequent metabolic modelling and network analysis. Yet, currently this task requires matching the observed spectrum against a database of reference spectra originating from similar equipment and closely matching operating parameters, a condition that is rarely satisfied in publi...
Journal
Journal title: Journal of Chemical Information and Modeling
Year: 2020
ISSN: 1549-9596,1549-960X
DOI: 10.1021/acs.jcim.0c00565